Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in the 
application. Applicant has submitted a new complete claim set showing marked up 
claims with insertions indicated by underlining and deletions indicated by strikeouts 
and/or double bracketing. 

Listing of Claims: 

1. (Currently amended) A method of estimating selectivity of a given string predicate of 
length n in a database query, comprising: 

a) estimating selectivities of a plurality of string predicate substrings, the 
plurality of string predicate substrings including substrings of the given string predicate 
and having each of [[various]] substring [[lengths]] length between q to n, where q < n; 

b) categorizing each of the string predicate substrings based on length; 

[[b]]c) selecting [[a]] one candidate substring for each category of substring length 
based on estimated selectivities of the substrings to obtain a plurality of candidate 
identifying substrings, each candidate identifying substring in the plurality of identifying 
substrings having a different length between q and n; 

[[c]]d) combining the estimated selectivities of each of the candidate substrings in 
the plurality of identifying substrings; and 

[[d]]e) returning the combined estimated selectivities of the candidate substrings as 
the estimated selectivity of the given string predicate. 
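
As an illustration only, steps a) through e) of claim 1 might be sketched as follows. The function names, the `estimate` and `combine` callables, the arithmetic-mean combiner, and the toy estimator are all assumptions made for this sketch and do not appear in the claims; choosing the lowest-selectivity substring at each length follows the rule recited in claim 3.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def substrings_by_length(predicate: str, q: int) -> Dict[int, List[str]]:
    """Steps a)/b): enumerate substrings of length q..n and group them by length."""
    n = len(predicate)
    groups: Dict[int, List[str]] = defaultdict(list)
    for length in range(q, n + 1):
        for start in range(n - length + 1):
            groups[length].append(predicate[start:start + length])
    return dict(groups)


def estimate_predicate_selectivity(
    predicate: str,
    q: int,
    estimate: Callable[[str], float],
    combine: Callable[[Dict[int, float]], float],
) -> float:
    groups = substrings_by_length(predicate, q)
    candidates: Dict[int, float] = {}
    for length, subs in groups.items():
        # Step c): one candidate per length; here the substring with the
        # lowest estimated selectivity (the rule of claim 3).
        best = min(subs, key=estimate)
        candidates[length] = estimate(best)
    # Steps d)/e): combine the per-length candidate selectivities and return.
    return combine(candidates)


def mean_combine(candidates: Dict[int, float]) -> float:
    """Hypothetical combiner: plain arithmetic mean over lengths."""
    return sum(candidates.values()) / len(candidates)


# Toy estimator (assumption): every character halves the selectivity.
toy_estimate = lambda s: 0.5 ** len(s)
result = estimate_predicate_selectivity("abcd", 2, toy_estimate, mean_combine)
```

Any real estimator or combiner could be passed in place of the toy ones; the sketch only fixes the order of the steps.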

2. (Original) The method of claim 1 further comprising storing selectivity information for 
the database and using stored selectivity information to estimate the selectivities of the 
substrings of various lengths. 

3. (Original) The method of claim 1 wherein a substring with a lowest estimated 
selectivity is selected as the candidate substring at each length. 

4. (Original) The method of claim 1 further comprising calculating exact selectivities of 
substrings of a given maximum length and using the exact selectivities to estimate the 
selectivities of the substrings of various substring lengths. 

5. (Original) The method of claim 4 wherein a range of the various substring lengths 
whose selectivities are estimated is between the given maximum length of the 
substrings whose selectivities are calculated exactly and the length of the given string 
predicate. 

6. (Original) The method of claim 4 wherein the candidate substring for the length equal 
to the given maximum length of the substrings whose selectivities are calculated exactly 
is selected based on the exact selectivity of the substring. 

7. (Original) The method of claim 1 wherein a q-gram table is constructed for substrings 
of a given maximum length and is accessed to estimate selectivities of substrings of 
various substring lengths. 

8. (Original) The method of claim 4 wherein a markov estimator uses the exact 
selectivities to estimate the selectivities of the substrings of various substring lengths. 
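
For illustration, the q-gram table of claim 7 and the Markov estimator of claim 8 might be sketched as below. This is a hedged sketch, not the claimed implementation: it assumes selectivity means the fraction of rows containing a substring, and it uses one common form of Markov estimator in which each additional q-gram contributes the ratio of its selectivity to that of its (q-1)-character overlap with the preceding q-gram. The names `build_qgram_table` and `markov_estimate` are hypothetical.

```python
from typing import Dict, List


def build_qgram_table(rows: List[str], q: int) -> Dict[str, float]:
    """Exact selectivity (fraction of rows containing it) of every
    substring of length 1..q found in the rows."""
    counts: Dict[str, int] = {}
    for row in rows:
        seen = set()
        for length in range(1, q + 1):
            for start in range(len(row) - length + 1):
                seen.add(row[start:start + length])
        for gram in seen:  # count each row at most once per gram
            counts[gram] = counts.get(gram, 0) + 1
    return {gram: c / len(rows) for gram, c in counts.items()}


def markov_estimate(s: str, table: Dict[str, float], q: int) -> float:
    """Estimate the selectivity of s from exact selectivities of grams up to length q."""
    if len(s) <= q:
        return table.get(s, 0.0)  # short enough to look up exactly
    estimate = table.get(s[:q], 0.0)
    for i in range(1, len(s) - q + 1):
        gram = s[i:i + q]
        overlap = s[i:i + q - 1]
        # With q == 1 there is no overlap, so the factors are simply multiplied.
        denom = table.get(overlap, 0.0) if overlap else 1.0
        if denom == 0.0:
            return 0.0
        estimate *= table.get(gram, 0.0) / denom
    return estimate


rows = ["abc", "abd", "xbc", "abc"]  # toy column values (assumption)
table = build_qgram_table(rows, q=2)
estimated = markov_estimate("abc", table, q=2)
```

In this toy column the estimate for "abc" differs from its true selectivity, which is the kind of error the combining step over several candidate lengths is meant to reduce.
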

9. (Original) The method of claim 1 wherein characteristics of string values in a relation 
of the database are used to combine the estimated selectivities of the candidate 
substrings. 

10. (Original) The method of claim 1 wherein characteristics of a workload of queries are 
used to combine the estimated selectivities of the candidate substrings. 

11. (Original) The method of claim 1 wherein a model for combining the estimated 
selectivities of candidate substrings is learned from query workloads. 

12. (Original) The method of claim 1 wherein said model is applied to the candidate 
substrings at run time to estimate the string predicate selectivity. 

13. (Original) The method of claim 1 wherein the given string predicate is a unit 
predicate. 

14. (Original) The method of claim 1 wherein the given string predicate includes a 
wildcard character. 

15. (Original) The method of claim 1 wherein the given string predicate is a range 
predicate. 

16. (Original) The method of claim 1 wherein weights are assigned to each length of 
candidate substring to combine the selectivities of the candidate substrings. 

17. (Original) The method of claim 16 wherein a function for assigning said weights is 
learned from data sets of the database. 

18. (Original) The method of claim 16 wherein a function for assigning said weights is 
learned from an expected query workload. 

19. (Original) The method of claim 16 further comprising calculating actual selectivities 
of substrings of queries from an expected workload and determining estimated 
selectivities of the substrings of queries from the expected workload to learn a 
function for assigning said weights. 

20. (Original) The method of claim 16 further comprising calculating for a string 
predicate of a query from an expected workload an actual selectivity of a candidate 
substring having the given length, determining for the string predicate of the query 
from the expected workload an estimated selectivity of the candidate substring having 
the given length, and assigning a weight to candidate substrings of a given length 
based on a relationship between the calculated actual selectivity and the determined 
estimated selectivity. 
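
One way the per-length weights of claims 16 and 20 could be derived is sketched below, purely as an illustration. The claims say only that the weight is based on a relationship between the actual and estimated selectivities; this sketch assumes that relationship is the mean relative error at each length, with weights inversely proportional to it. The names `learn_length_weights`, `combine_weighted`, and the `eps` guard are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# One workload observation: (substring length, actual selectivity, estimated selectivity)
Observation = Tuple[int, float, float]


def learn_length_weights(observations: Iterable[Observation],
                         eps: float = 1e-9) -> Dict[int, float]:
    """Assign each length a weight inversely proportional to its mean relative error."""
    errors: Dict[int, List[float]] = {}
    for length, actual, estimated in observations:
        rel_err = abs(estimated - actual) / max(actual, eps)
        errors.setdefault(length, []).append(rel_err)
    raw = {length: 1.0 / (sum(errs) / len(errs) + eps)
           for length, errs in errors.items()}
    total = sum(raw.values())
    return {length: w / total for length, w in raw.items()}  # weights sum to 1


def combine_weighted(candidates: Dict[int, float],
                     weights: Dict[int, float]) -> float:
    """Weighted average of candidate selectivities over lengths that have a weight."""
    shared = [length for length in candidates if length in weights]
    if not shared:
        raise ValueError("no candidate length has a learned weight")
    norm = sum(weights[length] for length in shared)
    return sum(weights[length] * candidates[length] for length in shared) / norm


# Toy workload (assumption): length-3 estimates were much closer than length-2 ones.
weights = learn_length_weights([(2, 0.10, 0.20), (3, 0.10, 0.11)])
combined = combine_weighted({2: 0.20, 3: 0.10}, weights)
```

Because the length-3 estimates were more accurate in the toy workload, the combined value is pulled toward the length-3 candidate.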

21. (Original) The method of claim 1 wherein selectivities of the candidate substrings 
are combined using regression trees. 

22. (Original) The method of claim 20 wherein said regression trees are learned from 
data sets of the database. 

23. (Original) The method of claim 20 wherein said regression trees are learned from an 
expected query workload. 

24. (Currently amended) A computer readable medium having computer executable 
instructions stored thereon for performing a method of estimating selectivity of a given 
string predicate of length n in a database query, the method comprising: 

a) estimating selectivities of a plurality of substrings, the plurality of string 
predicate substrings including substrings of the given string predicate and having each 
of [[various]] substring [[lengths]] length between q to n, where q < n; 

b) categorizing each of the string predicate substrings based on length; 

[[b]]c) selecting [[a]] one candidate substring for each category of substring length 
based on estimated selectivities of the substrings to obtain a plurality of candidate 
identifying substrings, each candidate identifying substring in the plurality of identifying 
substrings having a different length between q and n; 

[[c]]d) combining the estimated selectivities of each of the candidate substrings in 
the plurality of identifying substrings; and 

[[d]]e) returning the combined estimated selectivities of the candidate substrings as 
the estimated selectivity of the given string predicate. 

25. (Original) The computer readable medium of claim 24 wherein the method further 
comprises storing selectivity information for the database and using stored selectivity 
information to estimate the selectivities of the substrings of various lengths. 

26. (Original) The computer readable medium of claim 24 wherein a substring with a 
lowest estimated selectivity is selected as the candidate substring at each length. 

27. (Original) The computer readable medium of claim 24 wherein the method further 
comprises calculating exact selectivities of substrings of a given maximum length and 
using the exact selectivities to estimate the selectivities of the substrings of various 
substring lengths. 

28. (Original) The computer readable medium of claim 27 wherein a range of the various 
substring lengths whose selectivities are estimated is between the given maximum 
length of the substrings whose selectivities are calculated exactly and the length of the 
given string predicate. 

29. (Original) The computer readable medium of claim 27 wherein the candidate 
substring for the length equal to the given maximum length of the substrings whose 
selectivities are calculated exactly is selected based on the exact selectivity of the 
substring. 

30. (Original) The computer readable medium of claim 24 wherein a q-gram table is 
constructed for substrings of a given maximum length and is accessed to estimate 
selectivities of substrings of various substring lengths. 

31. (Original) The computer readable medium of claim 28 wherein a markov estimator 
uses the exact selectivities to estimate the selectivities of the substrings of various 
substring lengths. 

32. (Original) The computer readable medium of claim 24 wherein characteristics of 
string values in a relation of the database are used to combine the estimated 
selectivities of the candidate substrings. 

33. (Original) The computer readable medium of claim 24 wherein characteristics of a 
workload of queries are used to combine the estimated selectivities of the candidate 
substrings. 

34. (Original) The computer readable medium of claim 24 wherein a model for 
combining the estimated selectivities of candidate substrings is learned from query 
workloads. 

35. (Original) The computer readable medium of claim 24 wherein said model is applied 
to the candidate substrings at run time to estimate the string predicate selectivity. 

36. (Original) The computer readable medium of claim 24 wherein the given string 
predicate is a unit predicate. 

37. (Original) The computer readable medium of claim 24 wherein the given string 
predicate includes a wildcard character. 

38. (Original) The computer readable medium of claim 24 wherein the given string 
predicate is a range predicate. 

39. (Original) The computer readable medium of claim 24 wherein weights are assigned 
to each length of candidate substring to combine the selectivities of the candidate 
substrings. 

40. (Original) The computer readable medium of claim 39 wherein a function for 
assigning said weights is learned from data sets of the database. 

41. (Original) The computer readable medium of claim 39 wherein a function for 
assigning said weights is learned from an expected query workload. 

42. (Original) The computer readable medium of claim 39 wherein the method further 
comprises calculating actual selectivities of substrings of queries from an expected 
workload and determining estimated selectivities of the substrings of queries from the 
expected workload to learn a function for assigning said weights. 

43. (Original) The computer readable medium of claim 39 wherein the method further 
comprises calculating for a string predicate of a query from an expected workload an 
actual selectivity of a candidate substring having the given length, determining for the 
string predicate of the query from the expected workload an estimated selectivity of the 
candidate substring having the given length, and assigning a weight to candidate 
substrings of a given length based on a relationship between the calculated actual 
selectivity and the determined estimated selectivity. 

44. (Original) The computer readable medium of claim 24 wherein selectivities of the 
candidate substrings are combined using regression trees. 

45. (Original) The computer readable medium of claim 44 wherein said regression trees 
are learned from data sets of the database. 

46. (Original) The computer readable medium of claim 44 wherein said regression trees 
are learned from an expected query workload. 